Wonka Bot

AI Meme Generator

What do you get when you combine someone who is not the most original of content creators (me) with an avid meme lover (me x2, or meme, haha, get it?)? An automated meme generator, of course! This is the journey of how I created a workflow for automating meme production.


Handshake meme


Design

A meme is generally made up of three core components: an image or video, text labels, and a cultural reference. Taking inspiration from Imgflip's Meme Generator, I built a text-label generator that auto-populates a template of my choosing, and what better template to use than one of my favorite memes: Condescending Wonka.

Implementation

Given that memes often rely on cultural references, I needed a way to preserve certain relationships between words in order to create meaningful text labels that best describe Wonka's condescension. Since plenty of memes with varying content are already available online, I decided to curate a dataset of labels to generate original content from.

Web Scraping

I used Beautiful Soup to scrape Condescending Wonka meme labels from Meme Generator, which were then cleaned, pre-processed, and stored in a CSV dataset, providing over 100,000 entries of training data.
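A minimal sketch of that scraping step, assuming the requests library and a hypothetical page layout; the listing URL pattern and the CSS class names are placeholders you would confirm by inspecting the site yourself:

```python
import csv
import requests
from bs4 import BeautifulSoup

# Hypothetical listing URL; the real Meme Generator pagination scheme
# would need to be checked in the browser first.
BASE_URL = "https://memegenerator.net/Condescending-Wonka/images/popular/alltime/page/{}"

def scrape_labels(num_pages: int) -> list[tuple[str, str]]:
    """Collect (top_text, bottom_text) pairs from the meme listing pages."""
    labels = []
    for page in range(1, num_pages + 1):
        resp = requests.get(BASE_URL.format(page), timeout=10)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        # Assumed markup: each meme card carries its captions in two divs.
        for card in soup.find_all("div", class_="char-img"):
            top = card.find("div", class_="optimized-instance-text0")
            bottom = card.find("div", class_="optimized-instance-text1")
            if top and bottom:
                labels.append((top.get_text(strip=True), bottom.get_text(strip=True)))
    return labels

if __name__ == "__main__":
    rows = scrape_labels(num_pages=50)
    with open("wonka_labels.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["top_text", "bottom_text"])
        writer.writerows(rows)
```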

Generator

The dataset is then imported, and I use NLTK to tokenize and lemmatize the labels before feeding them into a Markov chain. The chain creates original meme labels via a next-word prediction strategy, with stochastic parameters I can adjust to determine how much relevancy is retained in the generated labels.
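A rough sketch of that generator, assuming the CSV produced by the scraper above; the temperature knob is my stand-in for the stochastic parameters mentioned here, not necessarily the exact control the bot uses:

```python
import random
from collections import defaultdict, Counter

import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

nltk.download("punkt", quiet=True)
nltk.download("wordnet", quiet=True)

lemmatizer = WordNetLemmatizer()

def build_chain(labels):
    """Map each lemmatized token to a frequency count of the tokens that follow it."""
    chain = defaultdict(Counter)
    for label in labels:
        tokens = [lemmatizer.lemmatize(t.lower()) for t in word_tokenize(label)]
        for current, nxt in zip(["<START>"] + tokens, tokens + ["<END>"]):
            chain[current][nxt] += 1
    return chain

def generate_label(chain, temperature=1.0, max_words=12):
    """Walk the chain one word at a time; a higher temperature flattens the
    next-word distribution and loosens the tie to the training labels."""
    word, output = "<START>", []
    while len(output) < max_words:
        candidates = chain.get(word)
        if not candidates:
            break
        words = list(candidates.keys())
        weights = [count ** (1.0 / temperature) for count in candidates.values()]
        word = random.choices(words, weights=weights)[0]
        if word == "<END>":
            break
        output.append(word)
    return " ".join(output)

if __name__ == "__main__":
    import csv
    with open("wonka_labels.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    labels = [r["top_text"] + " " + r["bottom_text"] for r in rows]
    chain = build_chain(labels)
    print(generate_label(chain, temperature=1.2))
```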

Meme Creation

Finally, the generated text labels are overlaid on the stock meme template image via PIL, and Wonka Bot is complete. I added a little flair at the end by optionally replacing certain word types within the generated label with user input, tailoring the relevancy to the user.
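A minimal sketch of the overlay step, assuming local copies of the template image and the Impact font (both file paths are placeholders):

```python
from PIL import Image, ImageDraw, ImageFont

# Assumed file names: the stock Condescending Wonka template and a local
# copy of the classic Impact font; swap in whatever paths you actually have.
TEMPLATE_PATH = "condescending_wonka.jpg"
FONT_PATH = "impact.ttf"

def make_meme(top_text: str, bottom_text: str, out_path: str = "wonka_meme.jpg"):
    """Overlay the generated labels on the template in classic top/bottom style."""
    img = Image.open(TEMPLATE_PATH).convert("RGB")
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(FONT_PATH, size=img.width // 10)

    width, height = img.size
    for text, y in ((top_text.upper(), 10), (bottom_text.upper(), height - img.width // 8)):
        # White fill with a black outline keeps the text readable on any photo.
        draw.text(
            (width / 2, y),
            text,
            font=font,
            fill="white",
            stroke_width=2,
            stroke_fill="black",
            anchor="ma",  # horizontally centered, anchored at the top of the text
        )
    img.save(out_path)

if __name__ == "__main__":
    make_meme("you generate memes with ai", "tell me more about your creativity")
```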

Outcome

After experimenting with the Wonka Bot for a while, I realized that the fun of meme making isn't about the final meme output, but about the friends and laughs we made along the way. To be fair, a Markov chain also isn't the best at preserving all the cultural relevancy in the generated labels, so there is room to create more meaningful generations. In the future, I would consider applying Large Language Models along with a curated knowledge base to create better-sampled outputs with more customizability.